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Abstract. This paper is concerned with the numerical minimization of energy functionals in Hilbert spaces 
involving convex constraints coinciding with a semi-norm for a subspace. The optimization is realized by alternating 
minimizations of the functional on a sequence of orthogonal subspaces. On each subspace an iterative proximity- map 
algorithm is implemented via oblique thresholding, which is the main new tool introduced in this work. We provide 
convergence conditions for the algorithm in order to compute minimizers of the target energy. Analogous results 
are derived for a parallel variant of the algorithm. Applications are presented in domain decomposition methods for 
singular elliptic PDE's arising in total variation minimization and in accelerated sparse recovery algorithms based 
on £i-minimization. We include numerical examples which show efficient solutions to classical problems in signal 
and image processing. 
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1. Introduction. Let 7i be a real separable Hilbert space. We are interested in the numerical 
minimization in TL of the general form of functionals 



where T G C{T-C) is a bounded linear operator, (7 G 7i is a datum, and a > is a fixed constant. 
The function ip : TC ^ M+ U {+00} is a semi-norm for a suitable subspace TC^ of H. In particular, 
we investigate splittings into arbitrary orthogonal subspaces 7i = Vi © V2 for which we may have 



where Try. is the orthogonal projection onto Vi. With this splitting wc want to minimize by 
suitable instances of the following alternating algorithm: Pick an initial Vi © V2 9 uf'' + u^'' := 
w*-"' G 7i*, for example u'"' ~ 0, and iterate 



This algorithm is implemented by solving the subspace minimizations via an oblique thresholding 
iteration. Wc provide a detailed analysis of the convergence properties of this sequential algorithm 
and of its modification for parallel computation. We motivate this rather general approach by two 
relevant applications in domain decomposition methods for total variation minimization and in 
accelerated sparse recovery algorithms based on ^i-minimization. Nevertheless, the applicability 
of our results reaches far beyond these particular examples. 

1.1. Domain decomposition methods for singular elliptic PDE's. Domain decompo- 
sition methods were introduced as techniques for solving partial differential equations based on 
a decomposition of the spatial domain of the problem into several subdomains [33l [7l |47l [TH [36l 
l48l [321 El • The initial equation restricted to the subdomains defines a sequence of new local 
problems. The main goal is to solve the initial equation via the solution of the local problems. 
This procedure induces a dimension reduction which is the major responsible of the success of 
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such a method. Indeed, one of the principal motivations is the formulation of solvers which can 
be easily parallelized. 

We apply the theory and the algorithms developed in this paper to adapt domain decompositions 
to the minimization of functionals with total variation constraints. Differently from situations clas- 
sically encountered in domain decomposition methods for nonsingular PDE's, where solutions are 
usually supposed at least continuous, in our case the interesting solutions may be discontinuous, 
e.g., along curves in 2D. These discontinuities may cross the interfaces of the domain decomposition 
patches. Hence, the crucial difficulty is the correct treatment of interfaces, with the preservation 
of crossing discontinuities and the correct matching where the solution is continuous instead. We 
consider the minimization of the functional J in the following setting: Let C M'', for d= 1,2, 
be a bounded open set with Lipschitz boundary. We are interested in the case when H = L'^{^), 
Ti,^ = BV{^1) and ip{u) = \Du\(^n), the variation of u. Then a domain decomposition O = ili U5l2 
induces the space splitting into Vi := {u € L^(r2) : supp(w) C ili}, i = 1,2. Hence, by means of 
the proposed alternating algorithm, we want to minimize the functional 

J{u) := \\Tu-g\\l.^,,)+2a\Du\in). 

The minimization of energies with total variation constraints traces back to the first uses of such 
a functional model in noise removal in digital images as proposed by Rudin, Osher, and Fatemi 
[39]. There the operator T is just the identity. Extensions to more general operators T and 
numerical methods for the minimization of the functional appeared later in several important 
contributions [HI [22l [3l [45l [13] . From these pioneering and very successful results, the scientific 
output related to total variation minimization and its applications in signal and image processing 
increased dramatically in the last decade. It is not worth here to mention all the possible directions 
and contributions. We limit ourself to mention that, to our knowledge, this paper is the first in 
presenting a successful domain decomposition approach to total variation minimization. The 
motivation is that several approaches are directed to the solution of the Euler-Lagrange equations 
associated to the functional J^, which determine a singular elliptic PDE involving the 1-Laplace 
operator. Due to the fact that |Z?w|(0) is not differentiable, one has to discretize its subdifferential, 
and its characterization is indeed hard to implement numerically in a correct way. The lack of 
a simple characterization of the subdifferential of the total variation especially raises significant 
difficulties in dealing with discontinuous interfaces between patches of a domain decomposition. 
Our approach overcomes these difficulties by minimizing the functional via an iterative proximity- 
map algorithm, as proposed, e.g., in [13], instead of attempting the direct solution of the Euler- 
Lagrange equations. It is also worth to mention that, due to the generality of our setting, our 
approach can be extended to more general subspace decompositions, not only those arising from 
a domain splitting. This can open room to more sophisticated multiscalc algorithms where Vi are 
multilevel spaces, e.g., from a wavelet decomposition. 

1.2. Accelerated sparse recovery algorithms based on £i-minimization. In this ap- 
plication, we are concerned with the use of the alternating algorithm to the case where A is a 
countable index set, 7i = ^2(A), and ip{u) = ||^||^i(A) •= SagaI^-'^I- "^^^ minimization of the 
functional 

J{u) := \\Tu - .g||f,(A) + '2a\\u\\e„ 

proved to be an extremely efficient alternative to the well-known Tikhonov regularization |27| . 
whenever 

Tu = g, 

is an ill-posed problem and the solution u is expected to be a vector with a moderate number of 
nonzero entries. Indeed, the imposition of the ^i-constraint does promote a sparse solution. The 
use of the £i norm as a sparsity-promoting functional can be found first in reflection seismology 
and in deconvolution of seismic traces [l7l [40l [42] . In the last decade more understanding of the 
deep motivations why ^i-minimization tends to promote sparse recovery was developed. Rigorous 
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results began to appear in the late-1980s, with Donoho and Stark j^S] and Donoho and Logan 
|24j . Applications for £i minimization in statistical estimation began in the mid-1990s with the 
introduction of the LASSO algorithm (iterative thresholding). In the signal processing com- 
munity, Basis Pursuit [16j was proposed in compression applications for extracting a sparse signal 
representation from highly overcomplete dictionaries. From these early steps the applications and 
understanding of £i minimization have continued to increase dramatically. It is now hard to trace 
all the relevant results and applications and it is beyond the scope of this paper. We shall address 
the interested reader to the review papers Pl fTUlH We may simply emphasize the importance 
of the study of ^i-minimization by saying that, due to the surprisingly effectiveness in several 
applications, it can be considered today as the "modern successor" of least squares. From this 
lapidary statement it follows the clear need for efficient algorithms for the minimization of J. An 
iterative thresholding algorithm was proposed for this task [THl HH [HI HB US] • We refer also to the 
recent developments |30[ 131] . Unfortunately, despite its simplicity which makes it very attractive 
to users, this algorithm does not perform very well. For this reason, together with other accelera- 
tion methods, e.g., [20] . a "domain decomposition" algorithm was proposed in |29| . and we proved 
its effectiveness in accelerating the convergence and we provided its parallelization. There the 
domain is the label set A which is disjointly decomposed A = Ai U A2. This decomposition induces 
an orthogonal splitting of £2(A) into the subspaces Vi = £2^ (A) := {u £ ^2(A) : supp(w) C Aj}, 
i = 1,2. In this paper we investigate the application of the alternating algorithm to more gen- 
eral orthogonal subspace decompositions and we discuss how the choice can influence convergence 
properties and speed-up. Again the generality of our approach allows to experiment several possi- 
ble decompositions, but we limit ourself to present some key numerical examples in specific cases 
which help to highlight the properties, i.e., virtues and limitations, of the algorithm. 

1.3. Content of the paper. In section 2 we illustrate the general assumptions on the convex 
constraint function and the subspace decompositions. In section 3 we formulate the minimization 
problem and motivate the use of the alternating subspace correction algorithm. With section 4 
we start the construction of the algorithmic approach to the minimization, introducing the novel 
concept of oblique thresholding^ computed via a generalized Lagrange multiplier. In section 5 
we investigate convergence properties of the alternating algorithm, presenting sufficient conditions 
which allow it to converge to minimizers of the target functional J7. The same results are presented 
in section 6 for a parallel variant of the algorithm. Section 7 is dedicated to applications and 
numerical experiments in domain decomposition methods for total variation minimization in ID 
and 2D problems, and in accelerations of convergence for £1— minimization. 

2. Preliminary Assumptions. We begin this section with a short description of the generic 
notations used in this paper. 

In the following 7i is a real separable Hilbert space endowed with the norm || • ||-^. For some 
countable index set A we denote by £p = £p{A), I < p < 00, the space of real sequences u = {ux)xi£A 
with norm 



\AeA / 

and ||it||oo ■= s^PxeA I'^aI usual. If (vx) is a sequence of positive weights then we define the 
weighted spaces £p^y = £p^y{A) = {u, {uxvx) S £p(A)} with norm 



(with the standard modification for p — 00). The Euclidean space is denoted by R*^ endowed with 




1 < P < CO 




the Euclidean norm, but we will also use the M-dimensional space 
^g-norm. By we denote the non-negative real numbers. 



£^\ i.e., R'^' endowed with the 



^The reader can also find a sufficiently comprehensive collection of the ongoing recent developments at the 
web-site |http://www. dsp. ece.rice.edu/cs/ 1 
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Usually C M'' will denote an open bounded set with Lipschitz boundary. The symbol LP{il) 
denotes the usual Lebesgue space of p-summable functions, C'^(r2) is the space of functions k- 
times continuously differentiable, and BV{fl) the space of functions with bounded variation. For 
a topological vector space V we denote V its topological dual. Depending on the context, the 
symbol ~ may define an equivalence of norms or an isomorphism of spaces or sets. The symbol 
In denotes the characteristic function of the set D.. 

More specific notations will be defined in the paper, where they turn out to be useful. 

2.1. The convex constraint function ijj. We are given a function tp : H ^ M+ U {+00} 
with the following properties: 

m = 0; 

('^2) ip is sublinear, i.e., ip{u + v) < ip{u) + ip{v) for all u,v ^ H; 
(^'3) ip is 1-homogeneous, i.e., ip{\u) = \X\ip{u) for all A G R. 

(^'4) Ip is lower-semincontinuous in 7i, i.e., for any converging sequence u„ — > it in 7i 

T16N 

Associated with ip we assume that there exists a dense subspace T-C^' C 7i for which "iAIw' is a 
seminorm and T-C^' endowed with the norm 

is a Banach space. We do not assume instead that is reflexive in general; note that due to the 
dense embedding C H we have 

and the duality (•, ■)(ji^yxn^ extends the scalar product on 7i. In particular, Ti. is weakly-*-dense 
in {T-C^y . In the following we require 

(HI) bounded subsets in Ti^ are sequentially bounded in another topology r''' of T-&; 

{H2) tp is lower-semicontinuous with respect to the topology r'^; 
In practice, we will always require also that 

(iJ3) H'f' ^{ueH: ij{u) < 00}. 
We list in the following the specific examples we consider in this paper. 

Examples 2.1. 1. Let f2 C K'*, for d= 1,2 be a bounded open set with Lipschitz boundary, 
and n = L'^{Vl). We recall that for u e L\^^{Vl) 

v{u,n) -supj^ uAw^ dx-.ipG [cl(^^)]^|l(^|loo < i| 

is the variation of u and that u G BV{^) (the space of bounded variation functions, \28l ) if 
and only ifV{u,fl) < 00, see fJl Proposition 3.6]. In such a case, \D[u)\{^) = V{u,n), where 
\D{u)\{^) is the total variation of the finite Radon measure Du, the derivative of u in the sense of 
distributions. Thus, we define ip{u) = V{u, ft) and it is immediate to see that TC^ must coincide 
with BV(VL). Due to the embedding L'^iVl) C L^{Q?) and the Sobolev embedding |IJ Theorem 3.4-7] 
we have 

WAn-^ - IMI2 + V{u,n) + \Du\{n) = \\u\\bv- 

Hence {TC^' , \\ ■ is indeed a Banach space. It is known that V{-,Vl) is lower-semincontinuous 

with respect to L^(f2) |I1 Proposition 3.6]. We say that a sequence {un)n in BV{^) converges to 
u G BV{^) with the weak-* -topology if (it„)„ converges to u in and Dun converges to Du 

with the weak-* -topology in the sense of the finite Randon measures. Bounded sets in BV{fl) are 
sequentially weakly-* -compact (J^ Proposition 3.13]), and V{-,fl) is lower-semicontinuous with 
respect to the weak-* -topology. 
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2. Let A be a countable index set and Ti ~ ^2(A). For a strictly positive sequence w ~ (il'a)agA; 

1. e., w\ > Wq > 0, we define ipi"^) = „(A) EagA ^'•^I'^-^l- '^^^ space Ti^ simply coincides 
with ^1 j„(A). Observe that bounded sets in TC^ are sequentially weakly compact in TL and that, by 
Fatou's lemma, is lower- semicontinuous with respect to both strong and weak topologies ofTi. 

3. Let TL = 'EJ^ endowed with the Euclidean norm, and Q : — > R", for n < N, is a fixed 
linear operator. We define ip{u) = ||(5u||£". Clearly Ti."^' = M.^ and all the requested properties 
are trivially fulfilled. One particular example of this finite dimensional situation is associated with 
the choice of Q : M.^ — > R-'^^-^ given by Q{u)i := N{ui^i — Ui), i = 0, . . . , iV — 2. In this case 

= IIQmII^jv-i is the discrete variation of the vector u and the model can be seen as a discrete 
approximation to the situation encountered in the first example, by discrete sampling and finite 
differences, i.e., setting Ui :— u{-^) and u e BV{Q, 1). 

2.2. Bounded subspace decompositions. In the following wc will consider orthogonal 
decompositions of Ti into closed subspaces. We will also require that such a splitting is bounded 

in rc^\ 

Assume that Vi, V2 are two closed, mutually orthogonal, and complementary subspaces of Ti, i.e., 
Ti = Vi ® V2, and Try. are the corresponding orthogonal projections, for i = 1,2. Moreover we 
require the mapping property 

Try : ^ V^"^ := nV„ i = 1, 2, 

continuously in the norm of , and Range(7ry. |^^) = V^'^ is closed. This implies that Ti,'^' splits 
into the direct sum K'^' = © . 

Examples 2.2. 1. Let fix C SI C R'', for d = 1,2, be two bounded open sets with Lipschitz 
boundaries, and = SI \ Oi . We define 

V, := {u e L'^{n) : supp('u) C Vl,}, i = 1,2. 

Then 7ry(M) = ulo.. For i:{u) = V{u,n), by Corollary 3.89], = BV{n) fW, is a closed 
subspace of BV{n) and Try. (u) = uIq- G , i = 1,2, for all u 6 BViVl). 

2. Similar decompositions can be considered for the examples where Ti. = ^2(A) and^p{u) = \\u\\ii 
see, e.g., \29^ . and TL = R^ and ip{u) = \\Qu\\e^^ . 

3. A Convex Variational Problem and Subspace Splitting. We are interested in the 
minimization in TC (actually in Ti"^') of the functional 

J{u) := \\Tu-g\\j^ + 2aij{u), 

where T G C{T-L) is a bounded linear operator, 5 G 7i is a datum, and a > is a fixed constant. 
In order to guarantee the existence of its minimizers we assume that: 

(C) J is coercive in TL, i.e., {J < C} := {u e H : J{u) < C} is bounded in H. 

Examples 3.1. 1. Assume Q C R'^, for d = 1,2 be a bounded open set with Lipschitz 
boundary, Ti = L^(S7) and tp{u) = V^(m, S7) (compare Examples \2.1\ 1). In this case we deal with 
total variation minimization. It is well-known that if Tin ^ then condition (C) is indeed 
satisfied, see \45\ Proposition 3.1] and fT^. 

2. Let A be a countable index set and Ti = A) . For a strictly positive sequence w = (wA)AgA; 
i.e., w\ > Wo > 0, we define ip{u) = \\u\\i-^ := X^asA '"'-^l^^l (compare with Examples \2. 1\ 2) . 
In this case condition (C) is trivially satisfied since J{u) > 2a'ip{u) ~ 2a\\u\\(-^ ^^^(^/^^ > 7||it||£,(A), 
for 7 = 2awQ > 0. 

Lemma 3.2. Under the assumptions above, J has minimizers in Ti^ . 

Proof. The proof is a standard application of the direct method of calculus of variations. Let 
{un)n CH, a. minimizing sequence. By assumption (C) we have ||un||-H + '0(un) < C for all n G N. 
Therefore by (HI) we can extract a subsequence in TC^ converging in the topology t"^' . Possibly 
passing to a further subsequence we can assume that it also converges weakly in Ji. By lower- 
semicontinuity of ||ru — with respect to the weak topology of Ji and the lower-semicontinuity 
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of ip with respect to the topology t'^ , ensured by assumption (H2), we have the wanted existence 
of minimizers. □ 

The minimization of ^7 is a classical problem |26] which was recently rc-considcred by several 
authors, [13l [18l [HI [211 [111 US], with emphasis on the computability of minimizers in particular 
cases. They studied essentially the same algorithm for the minimization. 
For ip with properties {'^1 — ^'4), there exists a closed convex set K^i, C H such that 

tp* (u) — sup{{v,u) — ^(v)} 

v<£H 



ifu e 
+00 otherwise. 



See also Examples 14.21 2 below. In the following we assume furthermore that = —K^. For any 
closed convex set C 7i we denote Pk(u) = argmin^g^ \\u — vWu the orthogonal projection onto 
K. For S^' := I — PaK^, called the generalized thresholding map in the signal processing literature, 
the iteration 



u 



in+i) ^ §V'(y(«) + T*{g - Tu^"))) (3.1) 



converges weakly to a minimizer u G Ti^ of J, for any initial choice u^^^ G Ti^' , provided T 
and g are suitably rescaled so that 1|T|| < 1. For particular situations, e.g., Ti. = ^2(A) and 
ipiu) — \\u\\e-^ ^, one can prove the convergence in norm [19[ I21j . 



As it is pointed out, for example in [20|,l29j. this algorithm converges with a poor rate, unless 
T is non-singular or has special additional spectral properties. For this reason accelerations by 
means of projected steepest descent iterations [50] and domain decomposition methods [15] were 
proposed. 

The particular situation considered in |29| isTi — ^2(A) and ip{u) = \\u\\(^^jyy In this case one 
takes advantage of the fact that for a disjoint partition of the index set A = Ai U A2 we have the 
splitting ipi^Ai + uj^^) = '0("Ai) + V'(^A2) for vector UAi supported on A^, i = 1, 2. Thus, a 
decomposition into column subspaces (i.e., componentwise) of the operator T (if identified with a 
suitable matrix) is realized, and alternating minimizations on these subspaces are performed by 
means of iterations of the type (|3.ip . This leads, e.g., to the following sequential algorithm: Pick 
an initial m^i^^ + '"aJ*'^^ •= ""^^^ G ^i(A), for example u^"^ = 0, and iterate 

r (n+l.O) (n.L) 



(n+lj+l) 



(4r^"' + TLM9 - TA,4r^'"^) - Ta.-'C''')) ^ = 0, . 



(3.2) 

Here the operator Sq is the soft-thresholding operator which acts componentwise SqU = {SaV\)\^A 
and defined by 

, , / a; - sign(a;)a, |a;| > a , . 

•^"^^^ " \ 0, otherwise. ^'^"^^ 

The expected benefit from this approach is twofold: 

1. Instead of solving one large problem with many iteration steps, we can solve approxima- 
tively several smaller subproblems, which might lead to an acceleration of convergence and 
a reduction of the overall computational effort, due to possible conditioning improvements; 

2. The subproblems do not need more sophisticated algorithms, simply reproduce at smaller 
dimension the original problem, and they can be solved in parallel. 
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The nice splitting of ip as a sum of evaluations on subspaces does not occur, for instance, when 
TC ~ L^(ri), ip{u) = V{u,n) = |_Du|(ri), and ili U C f2 C fJi U is a disjoint decomposition 
of n. Indeed, cf. [H Theorem 3.84], we have 

\Diun, + un,)m) ^ \Dun,mi) + \Dun,m2) + [ \u+^ix) ~ u^^{x)\dni{x) . (3.4) 

additional interface term 

Here one should not confuse Tid with any Ti.^ since the former indicates the Hausdorff mea- 
sure of dimension d. The symbols and denote the left and right approximated limits 
at jump points [TJ Proposition 3.69]. The presence of the additional boundary interface term 
/af2inao2 I^Oil-^) ~ ^n2(^)l'^'^i('^) does not allow to use in a straightforward way iterations as in 
p.ip to minimize the local problems on Qi. 

Moreover, also in the sequence space setting mentioned above, the hope for a better conditioning 
by column subspace splitting as in might be ill-posed, no such splitting needs to be well con- 
ditioned in general (good cases are provided in [33] instead). 

Therefore, one may want to consider arbitrary subspace decompositions and, in order to deal 
with these more general situations, we investigate splittings into arbitrary orthogonal subspaces 
7i = Vi © V2 for which we may have 

ipiiTvAu) +Trv2{v)) 7^ tPiT^Viiu)) +^'(7ry,(?;)). 

In principal, in this paper we limit ourself to consider the detailed analysis for two subspaces 
Vi,V2- Nevertheless, the arguments can be easily generalized to multiple subspaces Vi, . . . , V^, 
see, e.g., [29] . and in the numerical experiments we will also test this more general situation. 

With this splitting we want to minimize by suitable instances of the following alternating 
algorithm: Pick an initial Vi © V2 9 u^'' + 1*2° := m^^-* € 7i*, for example w*-"^ = 0, and iterate 

u\ « argmm„^gy^ J{vi +uy} 
< 4"+^)«argmin,^,y^ J(4"+^)+z;2) (3.5) 

We use (the approximation symbol) because in practice we never perform the exact minimiza- 
tion, as it occurred in (|3.2p . In the following section we discuss how to realize the approximation 
to the individual subspace minimizations. As pointed out above, this cannot just reduce to a 
simple iteration of the type (|3.ip . 

4. Local Minimization by Lagrange Multipliers. Let us consider, for example, 

argmin^^gy^ J{vi + U2) = argmin^^^y^ ||Tt;i - (g - Tu2)\\li + 2aV'(«i + "2)- (4.1) 

First of all, observe that {u G 7i : Try^u — U2, J{u) < C} C {J < C}, hence the former set is also 
bounded by assumption (C). By the same argument as in Lemma [221 the minimization (|4.ip has 
solutions. It is useful to us to introduce an auxiliary functional J7f*, called the surrogate Junctional 
of J: Assume a,ui G V\ and U2 S V2 and define 

Jl(ux + U2,a) ■= J{ui + U2) + \\ui - a\\l^ - \\T{ui - a)|||^. (4.2) 

A straightforward computation shows that 

Jl{ui + U2, a) ^ \\ui - (a + nviT*{g - Tu2 - Ta))\\\^ + 2a'i/'(wi + U2) + ^ia,g, U2), 

where $ is a function of a, <?, U2 only. We want to realize an approximate solution to (|4.ip by using 
the following algorithm: For u^"'' 6 , 

u['+^^ = argmin„^gy^ J^^m + U2, ^ I > 0. (4.3) 
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Before proving the convergence of this algorithm, we need to investigate first how to compute 
practicaUy u^"^^^ for uj""* given. To this end we need to introduce further notions and to reeah 
some useful results. 

4.1. Generalized Lagrange multipliers for nonsmooth objective functions. Let us 

begin this subsection with the notion of a subdifferential. 

Definition 4.1. For a locally convex space V and for a convex Junction F : V M.yj 
{ — cx), +oo}, we define the subdifferential of F at x £V, as dF{x) = if F{x) ~ oo, otherwise 

dF{x) := dFv{x) {x* e V' : {x*,y - x) + F{x) < F{y) Vy g V}, 

where V' denotes the dual space ofV. It is obvious from this definition that G dF{x) if and only 
if X is a minimizer of F. Since we deal with several spaces, namely, TC, TL^ , Vi, , it will turn out 
to he useful to distinguish sometimes in which space (and associated topology) the subdifferential 
is defined by imposing a subscript dyF for the subdifferential considered on the space V . 
Examples 4.2. 1. Let V = ^i(A) and F{x) := ||x||i is the £i-norm. We have 

d\\ ■ Ux) = e ^oo(A) : G e ai • \{xx),X e A} (4.4) 

where d\ ■ \{z) = {sign(z)} if z ^ and d\ ■ |(0) = [-1,1]. 

2. Assume V ^ 7i and ip > is a proper lower-semicontinuous convex function. For F{u; z) — 
\\u — z||^ + 2(p{u), we define the function 

prox^(2;) ~ argmin„gy F{u; z), 

which is called the proximity map in the convex analysis literature, e.g., \2b\ and generalized 
thresholding in the signal processing literature, e.g., \1!A \2(A \21l \29}/ . Observe that by cp > 
the function F is coercive in TL and by lower- semicontinuity and strict convexity of the term 
— z||^ this definition is well-posed. In particular, proX;^(z) is the unique solution of the following 
differential inclusion 

e (m - z) + dLp{u). 

It is well-known f26l \ 37}j that the proximity map is nonexpansive, i.e., 

II prox^(zi) - prox^(z2)||-« < ||zi - Z2\\n- 

In particular, if ip is a 1-homogeneous function then 

prox^(z) = (/ - PkJ{z), 

where is a suitable closed convex set associated to ip, see for instance U8}j . 
Under the notations of Definition 14.11 we consider the following problem 

aTgmm^^y{F{x) : G{x) = 0}, (4.5) 

where G : ^ K is a bounded linear operator on V. We have the following useful result. 

Theorem 4.3 (Generalized Lagrange multipliers for nonsmooth objective functions, Theorem 
1.8, [5]). If F is continuous in a point of kei G and G has closed range in V , thenapointxo SkerG 
is an optimal solution of |^.5[ ) if and only if 



9F(a;o) n Range G* ^ 0. 



SUBSPACE CORRECTION METHODS FOR OPTIMIZATION 



9 



4.2. Oblique thresholding. We want to exploit Theorem 14.31 in order to produce an algo- 
rithmic solution to each iteration step (|4.3p . 

Theorem 4.4 (Oblique thresholding). For U2 £ V2' and for z G Vi the following statements 
are equivalent: 

(i) ul = argmin„gy^ \\u - + 2a-0(ii + U2); 

(ii) there exists -q £ Range(7ry2 Iw*)* — {y})' such that G w* — (z — 77) + a9^*-0(u* + ^2)- 
Moreover, the following statements are equivalent and imply (i) and (ii). 

(Hi) there exists rj g V2 such that u\ = (/ — PaK^){z + U2 — rj) — U2 = §q + ^2 — rj) — U2 G Vi; 
(iv) there exists rj V2 such that i] = TTy^ P^i^^ (77 — (z + U2)). 

Proof. Let us show the equivalence between (i) and (ii). The problem in (i) can be reformulated 

as 

u\ = argmin„g^^,{F(u) := \\u - zj|^ + 2ai/'(it + U2), TTy^ (u) = 0}. 

The latter is a special instance of (|4.5p . Moreover, F is continuous on dVi — kcrTry^ in the 
norm-topology of (while in general it is not on Vi with the norm topology of Ti). Recall now 
that TTVj is assumed to be a bounded and surjective map with closed range in the norm-topology 
of Ti^' (see Section [^^ . This means that (ttvsI-^^)* is injectivc and that Range ( Try^ j-^v)* — (^2^)' 
is closed. Therefore, by an application of Theorem 14.31 the optimality of u\ is equivalent to the 
existence of 77 G Range(7rv2 — {^2)' ^VlcYy that 

-r, G dni^F{ul). 

Due to the continuity of ||m — in TC^ ^ we have, by [26l Proposition 5.6], that 

du^F{u\) = 2{ul - z) + 2adn^i!{ul + U2). 

Thus, the optimality of u\ is equivalent to 

Q <E u\ - {z - vj) + adu^tp{iL[ + U2). 

This concludes the equivalence of (i) and (ii). Let us show now that (iii) implies (ii). The condition 
in (iii) can be rewritten as 

^= (I - PaK^){z + U2-r]), ^ = ul+U2. 

Since ip >0 is 1-homogeneous and lower-semincontinuous, by Examples 14.21 2. the latter is equiv- 
alent to 

G C - (z + M2 - ?/) + adn^l^iO 

or, by (H3), 

? = argmin^g.^ \\u - {z + U2 - 77) ||^ + 2a'ip{u) 
= argmin^g.^V' \\u - (z + M2 - 77) 1|^ + 2a7/i(u) 

The latter optimal problem is equivalent to 

G ^ - (z + M2 - ?7) + q9^*V'(0 01' G - (z - 77) + ad-H'pi'iul + U2). 

Since V2 C {V2)' — Range(7rv2 I-h"/- )* we obtain that (iii) implies (ii). We prove now the equivalence 
between (iii) and (iv). We have 

Ul = (/ - PaK^){z + U2-ri) ~U2&Vi 
^ Z-rj- PaK^, (z + U2- 77). 
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By applying ttvs to both sides of the latter equality we get 

= -77 - TTV2 PaK^ {z + U2-r]). 
By recalling that — ^K^, we obtain the fixed point equation 

?7 = TTv^PaK^, (77 - (z + "2))- (4.6) 

Conversely, assume 77 = TTv2PaK^ ~ + ""2)) for some 77 e V2. Then 

(I - PaK^){z + U2 - r]) - U2 = Z - 71 - P^K^, {z + U2 - 1]) 

= Z - TTv^PaK^iri - (z + U2)) - PaK^iz + U2 - Tj) 

= Z - {I -TTV2 )PaK^ {z + U2- Tj) 

= 2 - TTy^ PaK^ (2 + 7*2 - 7;) = e . 

□ 

Remark 4.5. 1. Unfortunately in general we have V2 C {V^)' which excludes the complete 
equivalence of the previous conditions (i)-(iv). For example, in the case Ti. = ^2(A) and ^^{u) = 
\\u\\i.„ A = Ai U A2, = 4* (A) {u e 4(A) : supp(it) C AJ, i = 1,2, we have = £^^(A) 
{u e ii{A) : supp(7i) C A2}, hence, V2 C (y/)' ~ ^^^(A) ^ {u e ^oo(A) : supp(7t) C AJ. It can 
well be that 77 S ^^^(A) \ (A). However, since ip{u\-^ + u\^) = 'ip{u\-^) + ip{u\2) "in this case, 
we have G 7i^ — z + ad\\ ■ ||i(7ii) and therefore we may choose any 77 in d\\ ■ ||i(7i2). Following 
\29^ . U2 is assumed to be the result of soft-thresholded iterations, hence U2 is a finitely supported 
vector. Therefore, by Examples \4-.2\ 1, we can choose rj to be also a finitely supported vector, hence 
rj G ^2^(A) ~ V2. This means that the existence of 7] G V2 as in (Hi) or (iv) of the previous theorem 
may occur also in those cases for which V2 C {V^^Y ■ general, we can only observe that V2 is 
weakly-* -dense in {V,^)' ■ 

2. For Ti. with finite dimension - which is the relevant case in numerical applications - all the 
spaces are independent of the particular attached norm and coincide with their duals, hence all the 
statements (i)-(iv) of the previous theorem are equivalent in this case. 

A simple constructive test for the existence of 77 G V2 as in (iii) or (iv) of the previous theorem 
is provided by the following iterative algorithm: 

^(0) e V2, 77(™+i) = ^v.PcK^ iv^'^^ -{z + U2)), m>0. (4.7) 
Proposition 4.6. The following statements are equivalent: 

(i) there exists 77 G V2 such that rj = Trv2PaK^,{'t ~ ~^ ''^2)) (which is in turn the condition 

(iv) of Theorem \4.4\ ) 
(ii) the iteration 7| ) converges weakly to any 77 G V2 that satisfies |4.5[ ). 
In particular, there are no fixed points of |^.^ if and only if ||77(™) — > 00, for m ^ 00. 
For the proof of this Proposition we need to recall some classical notions and results. 
Definition 4.7. A nonexpansive map T : Ti. —i- Ti. is strongly nonexpansive if for {un — Vn)n 
bounded and ||T(m„) — T{vn)\\H ~ \\un ~ Vn\\H ^ we have 

Un-Vn- T{Un) - T{Vn) ^0, 77 ^ OO. 

Proposition 4.8 (Corollaries 1.3, 1.4, and 1.5 [§]). LetT -.Ti ^ Ti be a strongly nonexpansive 
map. Then FixT = {u £ Ti : T{u) = u} =^ 9 if and only if {T"-u)„ converges weakly to a fixed 
point uq G FixT for any choice of u G Ti. 

Proof. (ProDOsition l4.6| ) Orthogonal projections onto convex sets are strongly nonexpansive [8l 
Corollary 4.2.3]. Moreover, composition of strongly nonexpansive maps are strongly nonexpansive 
[SJ Lemma 2.1]. By an application of Proposition 14.81 we immediately have the result, since any 
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map of the type T(^) ~ Q{£,) + Co is strongly nonexpansive whenever Q is (this is a simple 
observation from the definition of strongly nonexpansive map). Indeed, we are looking for fixed 
points ofT] = TTy^PaK^ {v - {z + U2)) or, equivalently, of ^ Try^PaK^ (C) ~ (z + ^2)- 

□ 

In Examples 14.21 we have already observed that 

ul =prox„^,(._^„^)(2:). 

For consistency with the terminology of generalized thresholding in signal processing, we may call 
the map i)tox^^^,j^^^^ an oblique thresholding and we denote it by 

E,t'^''^Hz;u2) := prox„,^,(.+„^) (z). 

The attribute ^^oblique^^ emphasizes the presence of an additional subspace which acts for the 
computation of the thresholded solution. By using results in Subsection 2.3] (see also 
II. 2-3]) we can already infer that 

\\St'''"'''{zi;u2) - St'''''''{z2;u2)\\H < \\zi - Z2\\n, for all zi,z2 e V,. 

4.3. Convergence of the subspace minimization. In light of the results of the previous 
subsection, the iterative algorithm (|4.3p can be equivalently be rewritten as 

^ St'^^yHuf^ +nv,T*{g -Tu2-Tu[%U2). (4.8) 
In certain cases, e.g., in finite dimensions, the iteration can be explicitely computed by 

"^'^ - St{u['^ + nv,T*{g - Tu2 - Tu'^P) + U2 - v^'^) - U2, 
where t^^^-* G V2 is any solution of the fixed point equation 

77 = 'Kv2PaK^{-n - {ui'' + 7TviT*{g - Tu2 - Tu^P) + W2)). 

The computation of •q'^^^ can be (approximatively) implemented by the algorithm (|4.7p . 

Theorem 4.9. Assume U2 E V2 and \\T\\ < 1. Then the iteration Ii4.8\ ) converges weakly to 
a solution £ V^' of J^. _?[ ) for any initial choice of u'l^ S . 

Proof. For the sake of completeness, we report the proof of this theorem, which follows the 
same strategy already proposed in the paper |19j , compare also similar results in |18j . In particular 
we want to apply Opial's fixed point theorem: 

Theorem 4.10 ([55]). Let the mapping A from Ti. to Ti. satisfy the following conditions: 
(i) A is nonexpansive: for all z,z' S Ti., \\Az — Az'Wt-i < \\z — z'Wy^; 
(ii) A is asymptotically regular: for all z Gli,, \\A"^^z — A"'z\\-}i — » 0, for n —^ 00; 
(Hi) the set J- = Fix^ of fixed points of A inTi. is not empty. 
Then for all z £1-1, the sequence (^"z)„gN converges weakly to a fixed point in J- . 

We need to prove that A{ui) := S"^''^!'^^ [ui+ttviT* {g — Tu2—Tui)\ U2) fulfills the assumptions 
of the Opial's theorem on Vi. 

Step 1. As stated at the beginning of this section, there exist solutions G Vf to (14. ip . 
With a similar argument to the one used to prove the equivalence of (i) and (ii) in Theorem 14.4) 
the optimality of u\ can be readily proved equivalent to 

e -TTViT*{g - Tu2 - Tu\) + rj + ad-}ii,ip{ul + U2), 

for some 77 G (V'2''')'. By adding and subtracting we obtain 

G Ml - ((u* + TTViT*{g - Tu2 - Tu\)) -?]) + q;(9^^'0(u* + U2), 



12 



M. FORNASIER AND C.-B. SCHONLIEB 



By applying the equivalence of (i) and (ii) in Theorem 14.41 we obtain that is a fixed point of 
the following equation 

ul = + TTv,T*{g - Tu2 ~ Tul)-U2), 

hence Fix A ^ 0. 

Step 2. The algorithm produces iterations which are asymptotically regular, i.e., ||itj^^^^ — 
u-^^Wt-c 0. Indeed, by using ||r|j < 1 and C := 1 — ||r|p > 0, we have the following estimates 



J{u['^ + U2) ^ J,'{u['^ + U2,u[''^) 

>Jf{uf^''+U2,uf^) 

>J,%uf+'^+U2,u['+'^)^J{u['+'^+U2), 

See also (j5.3p and (|5.4p below. Since {J{u{^ + U2))i is monotonically decreasing and bounded 
from below by 0, necessarily it is a convergent sequence. Moreover, 

J{u'l^ + U2) - M'"-'^ + U2) > Cll^r - u['^ II?,, 

and the latter convergence implies H^^^^"^' — Wi^'Hw — * 0. 

Step 3. We are left with showing the nonexpansiveness of A. By noncxpansiveness of 
5^',Vi,V2(..u2) we obtain 

WSt'^^'^^Hul + 7Tv,T*{g - Tu2 ~ Tul;u2) - St^^'^'^iuj + ^v,T*{g - Tu2 - Tul U2)\\h 
< \\u\ + nv,T*{g - Tu2 - Tu\) - [ul + TTv,T*{g - Tu^ - Tul)\\n 
= - 7tv,T*Tttv,){u\ ^ ul)\\n 



< \\u] - 



In the latter inequality we used once more that ||T|| < 1. □ 

We do not insist on conditions for the strong convergence of the iteration (|4.8p . which is not 
a relevant issue, see, e.g., [IHl [H] for a further discussion in this direction. Indeed, the practical 
realization of (j3.5p will never solve completely the subspace minimizations. 

Let us conclude this section mentioning that all the results presented here hold symmetrically 
for the minimization on V2, and that the notations should be just adjusted accordingly. 

5. Convergence of the Sequential Alternating Subspace Minimization. We return 
to the algorithm (|3.5p . In the following we denote Ui = Tr^.u for i = 1,2. Let us explicitly express 
the algorithm as follows: Pick an initial Vi (BV2 3 u'l''"^ + itj"'^^"* £ li,'^ , for example 

= 0, and iterate 

(n+l.O) (ri,L) 



U 



- argmin„^,^^ Jl{u, + ^-^^^ 4"+^^^') £ = 0, . 



1 

(n+lfi) _ (n,M) 

(n+l.rn+l) . / (ri+l,L) {n+l,m)s „ n r i 

4 = argmm^^gy^ J2 (^1 +W2,M2 ) m = 0, . . . , M - 1 



(5.1) 



Note that we do prescribe a finite number L and M of inner iterations for each subspace respec- 
tively. In this section we want to prove its convergence for any choice of L and M . 

Observe that, for a G Vi and ||r|| < 1, 

\\u, - a||^ - \\Tu, - Ta\\^^ > C\\u, ~ a\\l^, (5.2) 
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for C = (1 - \\T\f) > 0. Hence 

J{u) = jf{u,Ui) < Jf{u,a), (5.3) 

and 

Jfiu, a) - Jf{u, u,) > C\\u, - a|||^. (5.4) 

Theorem 5.1 (Convergence properties). The algorithm in i5.1\) produces a sequence (w^"))„gN 
in TC^ with the following properties: 

(i) J(w(")) > J(u("+^^) for all n en (unless = ); 
(ii) lim„_oo - = 0; 

(Hi) the sequence (w'"^)„gN has subsequences which converge weakly in Ti. and in Ti^' endowed 
with the topology t"^ ; 

(iv) if we additionally assume, for simplicity, that dimTi < oo, (u^"'=')fcgN is a strongly con- 
verging subsequence, and its limit, then is a minimizer of J whenever one 
of the following conditions holds 

(a) ^{u[°^^ + 772) + ^(4°°^ + m) - + 4°°^) < ^iVi + m) for all ry, eV,,i = 1,2; 

(b) ip is differentiable at with respect to Vi for one i G {1,2}, i.e., there exists 
^^(m(°°)) := Q e {V,y such that 

. r V(4°°^+4"^+toi)-V'(4°°^+4'"^) , „ 
{Q,v,) = lim i ^ i — for all Vi^V^. 

Proof. Let us first observe that 

j7(«(")) J-f (,,(") +4"), «(")) = Jl5(^,("^^) +4"),,.("+i'°)). 

By definition of u^"^^'^' and the minimal properties of uj"^^'^' in (|5.ip we have 

From (|5.3p we have 

Putting in fine these inequahties we obtain 

j(^^("))>^(4"^'''' + 4"') 

In particular, from (|5.4p we have 

j(u(")) - :7(^1"+^'^) + 4")) > - II?,. 

After L steps we conclude the estimate 

J(«(")) > J(4"+i'^)+4")), 

and 



L-l 

{n+l,£+l) (n+l,f)||2 

^=0 



By definition of Wj"^^'^'' and its minimal properties we have 
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By similar arguments as above we finally find the decreasing estimate 

> Ji{uf+^'''^ + = (5.5) 

and 

CL-l Af-1 \ 

^ _ ^in+l,l) ||2^ ^ ^ _ 4"+!^") . (5.6) 

£=0 m=0 / 

From (15.51) we have J(u('')) > J(u(")). By the coerciveness condition (C) (M^"^)„gN is uniformly 
bounded in T-t^ , hence there exists a H-weakly- and r'^'-convergent subsequence (w("j^)jgN- Let us 
denote u^°°^ the weak limit of the subsequence. For simplicity, we rename such a subsequence by 
(-u("))„gN. Moreover, since the sequence ( J'(m^"'^))ti6N is monotonically decreasing and bounded 
from below by 0, it is also convergent. From ()5.6p and the latter convergence we deduce 

CL-l M~l \ 

Y.\\-r'^'^''-ur'''rn+Y: -0, n^oo. (5.7) 

1=0 m=0 / 

In particular, by the standard inequality (a^ + b'^) > ^{a + b)^ for a,b > and the triangle 
inequality, we have also 

0, n-^oo. (5.8) 
We would like now to show that the following outer lower semicontinuity holds 

G lim 9J(u(")) C 9J(u(°°'). 

n — ^oo 

For this we need to assume that 7i- weakly- and r'''— convergences do imply strong convergence in 
TL. This is the case, e.g., when dim(7i) < oo. The optimality condition for it^"^^'^^ is equivalent 
to 

e ^.1"+^'^^ - z^r^'^ + advM: + 4"'''^)(^/^"+'''^^ (5.9) 

where 

Analogously we have 

g 4"+!.^^) _ + adv,i^{- + ^i"+'''^))(4"+'''^^), (5.10) 

where 

(n + l) ^ (ri+l,A/-l) , _ rr*i^ ^^(ri+l,L) (ri+l,J\/-l) 



^2 '■~ ^2 



T^Vs?^ (ff - Tu\ - Tw^ ^). 



Due to the strong convergence of the sequence it*^"^ and by ()5.7|) we have the following limits for 
n — > oo 

^(n+l) (n + l,L) (n+1) ^ rp* / (oo) rp (oo)n -j^ 

^(n+1) {n+l,M) ('l + l) /■ rrn*/ rji (oo) rp (oo)n ^ t r 
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and 

^(„+i) ^ ^(„+i) _^ ^ _ T*{Tu^^^ - g). 

Moreover, we have 

a 

meaning that 

( 4i ',771 -ui + +U2 ) < ''/'('71 + Uo )' tor all 771 e v/i • 

a 

Analogously we have 

/ 1 ^(n+l) (n+l.A/)\ , ,/ (n+l.L) , (n+l.A/)x ^ ;/ , (n+l,L)-> r ,-, ^ ,^ 

( Oi ,'72-^2 ■ +"2 ■ )<V'('72+ui ), for aU 7726 V^- 

a 

By taking the limits for n — > c» and by (|5.7p we obtain 

+ <V'(^i+4°°')> for all 771 Gli. (5.11) 

a 

(--6,'?2-4°°')+^("l°°^) <V'(^2+«i°°^), for all 772 6^2. (5.12) 
a 

These latter conditions are rewritten in vector form as 



G 



( I ^^o.[dy,i,{-^ut\u'~r^)^dy,i,{-^ut\ut^)). (5.13) 



Observe now that 

If G ^ + ad-Hip {u^°°'^) then we would have the wanted minimality condition. While the inclusion 

easily follows from the definition of a subdifferential, the converse inclusion, which would imply 
from (|5.13p the wished minimality condition, does not hold in general. Thus, we show the converse 
inclusion under one of the following two conditions: 

(a) ip{u[°°^ + 772) + + m) - ^("1°°^ + 4°°^) < ^im + m) for all 77, G l/„ i = 1, 2; 

(b) 1/) is differentiable at 7*^°°^ with respect to 14 for one I G {1, 2}, i.e., there exists g^7/'(M(°°') := 
Ci G {ViY such that 

{Ci,v,) = hm -^-t ^ ^ — for aU v.eVi. 

t^o t 

Let us start with condition (a). We want to show that 

{--£,, ri - ?/(°°)) + 7/;(7i(°°)) < 7/.(7/), for aU 77 G 
a 

or, equivalently, that 

(--Ci, 771 - 7.1°°') + (--6, ^72 - 4°°') + M^^^ + 4°°') < ^{m + m), for all 77, G V,, 
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By the differential inclusions (|5.1ip and (|5.12p we have 

a a 
hence 

a a 
< + 772) + ijiui"^^ +m)' V'(^i"' + "2°°^ for aU rj, e V,. 

An application of condition (a) concludes the proof of the wanted differential inclusion. 

Let us show the inclusion now under the assumption of condition (b). Without loss of gen- 
erality, we assume that -0 is differentiable at it(°°) with respect to V2. First of all we define 
ipiui, U2) ■= "0(1*1 + U2). Since is convex, by an application of [371 Corollary 10.11], we have 

- dnipiui + U2)}. 

Since is differentiable at with respect to V2, for any (Ci, G dip{u\, U2) — d-j-ci^iui +U2) 
we have necessarily (2 = g^0(it^°°'') as the unique member of dv2'>P{- + ) (u''^^ ) . Hence, the 
following inclusion must also hold 

o.(|)+c.(.,*o+.-)(..hx..*(.+.,r)(4-')) 

C ^ ^ + advixV2'4^{ui,U2) 
=ie + a9w0(w^°°'). 

□ 

Remark 5.2. Observe that, by choosing 771 = 772 = 0, condition (a) and (^'1) imply that 

^p{u[°°^) + ^(4"') < ^(4°°^ + 4°°^) 

The suhlinearity (^'2) finally implies the splitting 

Conversely, if il){vi) + 0(w2) = 0('yi + ^"2) for all Vi S Vi, i — 1,2, then condition (a) easily 
follows. As previously discussed, this latter splitting condition holds only in special cases. Also 
condition (b) is not in practice always verified, as we will illustrate with numerical examples in 
Section \7.2\ Hence, we can affirm that in general we cannot expect convergence of the algorithm to 
minimizers of J , although it certainly converges to points for which J is smaller than the starting 
choice J^{u^^'^). However, as we will show in the numerical experiments related to total variation 
minimization (Section \7.1^ , the computed limit can be very close to the expected minimizer. 



6. A Parallel Alternating Subspace Minimization and its Convergence. The most 
immediate modification to (|5.ip is provided by substituting u^"'^^ instead of u^"^^'^'' in the second 
iteration, producing the following parallel algorithm: 

(«+1.0) _ (n,L) 
1 ~ "1 

(n+l.e+l) _ . (n,M) (n+1, 



argmin^^g^^ Jfiu^ + ^ ^ = 0, . . . , i - 1 



u 

^in+m ^ ^(„,M) (6.1) 



^ argmin„^,^^ JHu^-'"^^ + U2, 4"""''™^) m = 0, . . . , Af - 1 
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Unfortunately, this modification violates the monotonicity property J{u'^^^) > and 
the overall algorithm docs not converge in general. In order to preserve the monotonicity of the 
iteration with respect to JT" a simple trick can be applied, i.e., modifying uj"^^'^'' + 

^(n+i,M) ^j^g average of the current iteration and the previous one. This leads to the following 
parallel algorithm: 



(n+l.O) (n.L) 



4"+^^^+^) = argmin„^,^^ Jf{u, + ^ = 0, 

(n+1,0) _ (n,M) 
"2 ~ "2 



.L-l 



(6.2) 



(Ti+l,m+l) 



argmm„^gy^ J2''(Mi '+M2,W2 ) m = 0, . . . , A'/ - 1 



(n+l) 



In this section we prove similar convergence properties of this algorithm as for (|5.1 



Theorem 6.1 (Convergence properties). The algorithm in 16. 2\) produces a sequence (M^"-')„gN 
in TC^ with the following properties: 

(i) J7(w(")) > J(u("+i)) for all n en (unless u^") = ); 
(it) lim„_oo||u("+i)-w(")||w = 0; 

(Hi) the sequence (u^"^)„gN has subsequences which converge weakly in 7i and in T-C^ endowed 
with the topology ; 

(iv) if we additionally assume that dimTi < oo, (w^"''')^^^ is a strongly converging subse- 
quence, and u^°°^ is its limit, then u^°°^ is a minimizer of J' whenever one of the following 
conditions holds 

(a) V(wl°°' + V2) + V^(4"^ + Vi) - i^iu^r^ + 4°°^) < + m) for all r^,eV„i^ 1, 2; 

(b) i/j is differentiable at with respect to Vi for one i £ {1,2}, i.e., there exists 
^V(m^°°') := C» e {Vi)' such that 

(G, -y^) = Imi i i-^, for all ViGV. 

Proof. With the same argument as in the proof of Theorem 15. li we obtain 



L-l 



(n+l/+l) (n+l,f)||2 



and 



M-1 



(n+l,m+l) (n+l,m)ii2 



Un 



m=0 



Hence, by summing and halving 



> 



C 



/L-l 

E ii4^ 



(n+l,£)||2 



i?.+ Eii4 



(n+l,?n + l) (n+l,?n)||2 



By convexity we have 



H 



T 



{n+l.L) (n+l.M) 



'■2 



) + U 



(n) 
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Moreover, by sublinearity (^'2) and 1-homogencity (5'3) we have 

By the last two inequalities we immediately show that 

hence 



„ /L-l A/-1 \ 



Since the sequence (j7'(u'"''))rigN is monotonically decreasing and bounded from below by 0, it is 
also convergent. From (|6.3p and the latter convergence we deduce 

CL-l M~l \ 

Ehr^"^^^-4"'^^"'ll?.+ E 114"+^'"+^' -4"-*-^^"^ III. -0, n^oo. (6.4) 
l=Q m=0 ) 

In particular, by the standard inequality (a^ + \?) > ^(a + b)^ for a,b > and the triangle 
inequality, we have also 

E 11"!"+'''+'^ - u["+''''>rH > c" (e 114"+'-'+'^ - 4"+'''^ik) 

(n+l,L) „,(n)||2 



>C"\\u\ 



c"ii4"+'''^^ + 4"'-24"^ii^. 



Analogously we have 



2 



£=0 



By denoting C" = \C"' we obtain 



„ /L-l A/-1 

t-- / n (ri+l/+l) (ti+1,£)||2 I II (n+l/+l) (n+l,£)||2 



^ Eii-1 ^^-"r^^'iiUE 



2 

CO 



1=0 1=0 
I 

-|l4"+'-^)+4"+'''')+«(")-2«(")||2, 



> 

> CC"'||u("+i' -u("'||^. 
Therefore, we finally have 

||y(") ^0, n^oo. (6.5) 

The rest of the proof follows analogous arguments as in that of Theorem 15. II □ 

7. Applications and Numerics. In this section we present two nontrivial applications of 
the theory and algorithms illustrated in the previous sections to Examples 12. II 
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7.1. Domain decomposition methods for total variation minimization. In the follow- 
ing we consider the minimization of the functional J in the setting ofExamplcs l2.1l l. Namely, let 
r2 C R'^, for d = 1, 2, be a bounded open set with Lipschitz boundary. We are interested in the case 
when n = L'^in), = BV{n) and il'{u) = V{u, fl). Then the domain decomposition n = f]iUrJ2 
as described in Examples 12.21 1 induces the space splitting into Vi ;= {u e : supp(u) C fli}, 

and T^/ = BV{n) nVi, i = 1,2. In particular, we can consider multiple subspaces, since the 
algorithms and their analysis presented in the previous sections can be easily generalized to these 
cases, see [351 Section 6]. As before uq. — Try^ (u) = 1^^" is the orthogonal projection onto Vi. 
To exemplify the kind of difficulties one may encounter in the numerical treatment of the interfaces 
dQi n dilj , we present first an approach based on the direct discretization of the subdifferential 
of J7 in this setting. We show that this method can work properly in many cases, but it fails in 
others, even in simple ID examples, due to the raising of exceptions which cannot be captured by 
this formulation. Instead of insisting on dealing with these exceptions and strengthening the for- 
mulation, we show then that the general theory and algorithms previously presented work properly 
and deal well with interfaces both for d = 1,2. 

7.1.1. The "naive" direct approach. In light of (|3.4p . the first subiteration in (|3.5[) is 

given by 



4"+^) « argmin„^gv-ir(«i+4"Vfflli^(o)+2a (\D{v,)m,) + [ 



1 


+ (")- 











We would like to dispose of conditions to characterize subdifferentials of functionals of the type 

r{u) = \D{u)\ (n) + f \u+ - z\ dHd-i, 



interface condition 

where 9 C dft, in order to handle the boundary conditions that are imposed at the interface. Since 
we are interested in emphasizing the difhcultics of this approach, we do not insist on the details 
of the rigorous derivation of these conditions, and we limit ourself to mention the main facts. 
It is well known [45l Proposition 4.1] that, if no interface condition is present. ^ e d\D{-)\{il){u) 
implies 

-V-(^) inn 

^ \ \/u \ ' 

on dn. 

The previous conditions do not fully characterize ^ G 9|D(-)|(r2)(u), additional conditions would 
be required [2|, I45j. but the latter are, unfortunately, hardly numerically implementable. This 
lacking approach is the source of the failures of this direct method. The presence of the interface 
further modifies and deteriorates this situation and for dY(u) ^ we need to enforce 

. dU,^, + [ \in + ws)^-z\-\u^-z\ > q, V«; G C^in). 




The latter condition is implied by the following natural boundary conditions: 



|Vu| 



on on \ 9, 
on 9. 



(7.1) 



Note that the conditions above are again not sufRcient to characterize elements in the subd- 
ifferential of r. 
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7.1.2. Implementation of the subdifFerential approach in TV — interpolation. 

Let D C il C M.'^ open and bounded domains with Lipschitz boundaries. We assume that a 
function g G L'^{fl) is given only on n\ D, possibly with noise disturbance. The problem is to 
reconstruct a function u in the damaged domain D d which nearly coincides with g on Q \ D. 
In ID this is a classical interpolation problem, in 2D has taken the name of "inpainting" due to 
its applications in image restoration. Tl^-interpolation/inpainting with fidelity is solved by 
minimization of the functional 

J(u) = \\ln\D{u-g)\\l,^^^+2a\Diu)m), (7.2) 

where 1si\d denotes the characteristic function oi fl \ D. Hence, in this case T is the multiplier 
operator Tu = We consider in the following the problem for d = 1 so that = (a, b) is an 

interval. We may want to minimize (j7.2|) iteratively by a subgradient descent method, 



1 + 
Vii<")| dn 

where 



-^■i^^) + '^Mu^"^-9) inn 
= on dn, 



(7.3) 



A(.) = ^^^ 
^ ^ |0 D. 

We can also attempt the minimization by the following domain decomposition algorithm: We split 
n into two intervals 17 = Jli U and define two alternating minimizations on fii and with 
interface 9 = dni n 91^2 

"'""r"'"' --V-(^g;^) + 2Ai(4"^~g) inn, 

Wt = o ondn,\e 



iv«ri 



and 







r 




1 


04" + ^) 


1 


dn 


iv^-'i 


dn 



-V.(^^) + 2A2(4"^-g) inn, 

\VU2 I 

= on aria \ 6* 

£91-1(4"+'^+ -4""^'^") one. 



In this setting denotes the restriction of u G BV{n) to il^. The fitting parameter A is also 
split accordingly into Ai and A2 on n, and n, respectively. Note that we enforced the interface 
conditions (|7.ip . with the hope to match correctly the solution at the internal boundaries. 



The discretization in space is done by finite differences. We only explain the details for the first 
subproblem on n, because the procedure is analogous for the second one. Let i = 1, . . . , N denote 
the space nodes supported in n,. We denote h = and u{i) := u{i ■ h). The gradient and the 
divergence operator arc discrctized by backward differences and forward differences respectively, 

Vu(z) = - u{i - 1)) 

V • u{i) = ^{u{i + l) - u{i)) 



|V^.|(^) = \/e2 + -L(^i(^)-u(^™l))2, 
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for i = 2, . . . , — 1. The discretized equation on ili turns out to be 



,("+!)/ 



4"+^'(^)-4"+^)(^-l)^ 



4"+i)(, + l)_4»+i)(,) 



1) 



with Ci{i) = + (^^"''(i) — — and i = 2, ... — 1. The Neumann boundary 

conditions on the external portion of the boundary are enforced by 



mil) 



cm 



ui(2). 



The interface conditions on the internal boundaries are computed by solving the following subd- 
ifferential inclusion 

-{u'('+^\n) - u["+^\n - 1)) e c'liN) h-d\- - u^r^^\N)). 

For the solution of this subdifferential inclusion we recall that the soft-thresholded u — Sa (x) p.3p 
provides the unique solution of the subdifferential inclusion £ [u — x) + ad\ ■ \ {u). We reformulate 
our subdifferential inclusion as 



G 



V - (4"^(iV) - u^"+^\n - 1))] + c^(iV)/i • d\ ■ \iv) 



with V := u^^\n) - M^"+^^(7V) and get 

= ^crw/.(4"^(^) - - 1)). 

Therefore the interface condition on 9 reads as u'(^'^^\n) — u'^\n) — v. 



In the left column of Figure 17.11 three one dimensional signals are considered. The right 
column shows the result of the application of the domain decomposition method for total variation 
minimization described above. The support of the signals is split in two intervals. The interface 
developed by the two intervals is marked by a red dot. In all three examples we fixed Aq = 1 and 
r = 1/2. The first example 7.1(a)|7!l(b) shows a step function which has its step directly at the 
interface of the two intervals. The total variation minimization (|7.3p is applied with = 0. This 
example confirms that jumps are preserved at the interface of the two domains. The second and 



third example 7.1(c)||7yi(f) present the behaviour of the algorithm when interpolation across the 
interface is performed, i.e., D 7^ 0. In the example 7.1(c)|7^1(d) the computation at the interface 
is correctly performed. But the computation at the interface clearly fails in the last example 



7.1(e)p\l(f) compare the following remark. 

Remark 7.1. Evaluating the soft thresholding operator at u'^\n) — u'"^^^\n — 1) implies 
that we are treating implicitly the computation of u^""*'^'' at the interface. Namely the interface 
condition can be read as 



,(«+!) 



(N) 



,(") 



2 (N) - e("+i) ■ [4"'(iv) - ?/i"+')(iv - 1) 

^sign(4"^(A^) - uJ"+''(A^ - 1)) • c'f\N)h] 



where 



e(n+i) 



1, 

0, 



|4"^(7V) - u[''+'\n - 1)1 - c^"^(iv)/i > 

otherwise . 



The solution of the implicit problem is not immediate and one may prefer to modify the situation 
in order to obtain an explicit formulation by computing S'c5'(Ar)?i(u2"'' (-/V) — u'i^\N — 1)) instead of 
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Fig. 7.1. Examples of TV — inpainting in ID where the domain was split in two. (a)-(f): A = 1 and r = 1/2 



Sc^{N)hiu2^\N) — u']'^~^^\n — 1)). The problem here is that, with this discretization, we cannot 
capture differences in the steepness of ui and U2 at the interface because u'f^\N) = u'^\n) for all 
n. Indeed the condition \u'^2^ (N) — itj"'' (N — 1) | — c^"^ iN)h > is never satisfied and the interface 
becomes always a Dirichlet boundary condition. Even if we change the computation ofc^^\N) from 

y^e2 + {u["\n) ~ u["\n - l)Y/K^ to a forward difference \J e"^ + {u^^'\n + 1) - u^^\N)flh^ 
(as it is indeed done in the numerical examples presented in Figure |7. the method fails when 
the gradients are equal in absolute value on the left and the right side of the interface. 

We do not insist on trying to capture heuristically all the possible exceptions. We can expect 
that this approach to the problem may become even more deficient and more complicated to 
handle in 2D. Instead, we want to apply the theory of the previous sections which allows to deal 
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with the problem in a transparent way. 

7.1.3. The novel approach based on subspace corrections and oblique threshold- 
ing. We want to implement the algorithm (|5.ip for the minimization of J . To solve its subit- 
erations we compute the minimizer by means of oblique thresholding. Denote U2 — u^^'^^\ 
Ml = and z = + TTv,T*{g - Tu2 - Tw^"+^'^^). We would like to compute the 

minimizer 



ui = argmin„gy-^ - ^;||i2(o) + 2a\D{u + U2)\{n) 



by 



for any S V2. It is known [13| that A'|£)(.)|(o) is the closure of the set 

{dive:Ce [cl{n)]\\ax)\<i Vxen}. 

The element 77 G V2 is a limit of the corresponding fixed point iteration (j4.7p . 

In order to guarantee the concrete computability and the correctness of this procedure, we 
need to discretize the problem and approximate it in finite dimensions, compare Examples 12.11 3 
and Remark 1431 2. 

In contrast to the approach of the previous section, where we used the discretization of 
the subdifferential to solve the subiterations, in the following we directly work with discrete 
approximations of the functional J. In dimension d = 1 we consider vectors u E Ti. := M^, 
u = {ui, M2, . . . , Mat) with gradient Ux G M.^ given by 



iUx)i = 



Ui+i — Ui if i < iV 
if i = TV, 



for i = I, . . . , N. In this setting, instead of minimizing 

J{u) := \\Tu~g\\l,,, + 2a\D{u)m), 
we consider the discretized functional 

j\u) := J2 (((^")» - S^f + MiuM) ■ 

l<i<N 

To give a meaning to {Tu)i we assume that T is applied on the piecewise linear interpolant u of 
the vector {ui)f^^ (we will assume similarly for d — 2). 

In dimension d = 2, the continuous image domain fl = [a, b] x [c, d] C is approximated 
by a finite grid {a = xi < . . . < xn = b} x {c = yi < . . . < xm = d} with equidistant step-size 
h = Xi-^-i — Xi = = ^^YT = J/j+i ^ Vj equal to 1 (one pixel). The digital image u is an element 
in H := R^""^^. We denote u{x^,yj) = Ui^j for i = 1, . . . , TV and j = 1, . . . , Af. The gradient Vii 
is a vector in. H x H given by forward differences 

with 



I Ui+i J - if i < N 

[0 ift = iV, 

Uij+i - Ui^j \i j < M 

if j = M, 
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for i = 1, . . . , N, j 



M. The discretizcd functional in two dimensions is given by 



l<iJ<N 



) 



with \y\ = \/yl + yl for every y = {yi,y2) e 



For the definition of the set Kiq 



{■)\m 



in finite dimensions we further introduce a discrete 



divergence in one dimension V- : 7i — > ?i (resp. V- : ?i x ?i — > 7i in two dimensions) defined, by 
analogy with the continuous setting, by V- = —V* (V* is the adjoint of the gradient V). That is, 
the discrete divergence operator is given by backward differences, in one dimension by 



and, respectively, in two dimensions by 




if f < i < TV 
if i = 1 



{pn^J - if 1< i < 

{p'')i,j if i = 1 

(py),,, - {py^j-i if i< J < M 



for every p = {p^,py) & H x H. 

With these definitions the set K\ 



||(-)x||fjv ''^'^ dimension is given by 
{V-p:p<EH, b,| < IVi = l,...,iV}. 
and in two dimensions if||v( )|| jv^a/ given by 

{V -p-.penxH, < 1 Vi = 1, . . . ,iV and j = 1, . . . , Af} . 

To highlight the relationship between the continuous and discrete setting we introduce a step- 
size h ^ 1/N in ID {h ~ min{l/iV, in 2D respectively) in the discrete definition of J by 
defining a new functional J^^ equal to h times the expression above. One can show that as 
/i — > 0, T— converges to the continuous functional J, see [T3]- In particular, piecewise linear 
interpolants Uh of the minimizers of the discrete functional do converge to minimizers of J . 
This observation clearly justifies our discretization approach. 

For the computation of the projection in the oblique thresholding we can use an algorithm 
proposed by ChamboUe in [13]. In two dimensions the following semi- implicit gradient descent 



algorithm is given to approximate PaK, 



V()|(S2 



Choose r > 0, let = and, for any n > 0, iterate 



(n+l) 



SO that 



(«) 



T (V(V-p' 



(") 



.9/a)) 



(V(V.p(")-.9/a)).u 



(„+i) ^ pg+r(V(V.pW-.9/a)) 
^'^^ l + T|(V(V-p(") -5/a)) 



For T < 1/8 the iteration aV • p^") converges to PaK^^^.)^^a){9) " 
Theorem 3.1]). 

For c? = 1 a similar algorithm is given: 



(7.4) 
oo (compare [13l 
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We choose r > 0, let p*^"' = and for any n > 0, 



_ p<"^+r((V.p 



PI 



(n) 



l + r|((V-p(") ■ 



(7.5) 



In this case the convergence of aV ■ p*-"' to the corresponding projection as n — > oo is guaranteed 
for T < 1/4. 

7.1.4. Domain decompositions. In one dimension the domain fl = [a.b] is spht into two 
intervals fli = [a, ^^'^ ^2 = [[^1 + "^^^ interface d^i n is located between 

i = \N/2] in fli and i = \N/2] + 1 in f22- In two dimensions the domain = [a, b] x [c, d] is split 
in an analogous way with respect to its rows. In particular we have fli = [a, [-^]] x [c, d] and 
= + li^] ^ [c, rf], compare Figure \77^ The splitting in more than two domains is done 

similarly: 

Set f2 = ill U . . . U r^A'', the domain ft decomposed into A/" disjoint domains fli, 
i = l,...,J\f. Set s = \N/Af] . Then 

ni = [l,s] X [c,d\ 
for i = 2 : A/" - 1 

Qi = [{i - l)s + l,is] X [c,d\ 
end 

= [{Af -l)s + l,N] X [c,d]. 



a — xi 

X\N/2] 

b = XN 



^2 



Fig. 7.2. Decomposition of the discrete image in two domains f2i and Q.2 with interface dQi n dQ2 



To compute the fixed point 77 of (j4.6p in an efheient way we make the following considerations, 
which allow to restrict the computation to a relatively small stripe around the interface. For 
U2 & V2 and z e Vi a minimizer iti is given by 

ui = argmin„gyj|u - ^11^2(^2) + 2a\D{u + U2)\{VL). 

We further decompose VI2 ^ ^2^ {^2 \ ^2) with 9572 H dili = 8^2 H dfti, where CI2 C il2 is a 
neighborhood stripe around the interface dil2 n dfti, as illustrated in Figure [7751 By using the 
splitting of the total variation p.4p we can restrict the problem to an equivalent minimization 
where the total variation is only computed in fli uCt2- Namely, we have 



ui = argmm„gyJ|M - 



lL2(n) 



2a\D{u + U2)\{Qi U O2) 



Hence, for the computation of the fixed point rj £ V2, we need to carry out the iteration 
^{m+i) _ 7rv2PaKio(.)i(u)(v^"^^ — z + U2) Only in fii U tl2- By further observing that 77 will be 
supported only in O2, i.e. rj{x) = in Jli, we may additionally restrict the fixed point iteration 
on the relatively small stripe 51i U 5I2, where Oi C Hi is an neighborhood around the interface 
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ill 

(12 



n2\n2 



Fig. 7.3. Computation of ri only in the stripe Qi U r22- 



from the side of fii. Although the computation of rj restricted to Cti U CI2 is not equivalent to 
the computation of ij on whole fii U $72, the produced errors are in practice negligible, because of 
the Neumann boundary conditions involved in the computation of Paif^^j ji^^ y Symmetrically, 
one operates on the minimizations on fl2. 

7.1.5. Numerical experiments in one and two dimensions. We shall present numerical 
results in one and two dimensions for the algorithm in ()5.ip , and discuss them with respect to the 
choice of parameters. 

In one dimension we consider the same three signals already discussed for the "naive" approach 
in Figure FTTl In the left column of Figure [73] we report again the one dimensional signals. The 
right column shows the result of the application of the domain decomposition method (|5.ip for 
total variation minimization. The support of the signals is split in two intervals. The interface 
developed by the two intervals is marked by a red dot. In all three examples we fixed a = 1 
and T = 1/4. The first example 14.1114.31 shows a step function, which has its step directly at the 
interface of the two intervals. The total variation minimization (|5.1|) is applied with T — I. This 
example confirms that jumps are preserved at the interface of the two domains. The second and 



third example 7.4(c)||7.4(f) present the behaviour of the algorithm when interpolation across the 
interface is performed. In this case the operator T is given by the multiplier T = 1si\d, where D 
is an interval containing the interface point. In contrast to the performance of the interpolation 
of the "naive" approach for the third example. Figure |7.1(e)||7. 1 (f)[ ), the new approach solves the 
interpolation across the interface correctly, see Figure |7.4(e)|7.4(f)[ 

Inpainting results for the two dimensional case are shown in Figures 17.5117.61 The interface is 
here marked by a red line in the given image. In the first example in Figure [7751 the domain is split 
in two subdomains, in the second example in Figure 17.61 the domain is split in five subdomains. 
The Lagrange multiplier a > is chosen 10~^. The time-step for the computation of PaK^^^^.-i^^ii^ 
is chosen r = 1/4. The examples confirm the correct reconstruction of the image at the interface, 
preserving both continuities and discontinuities as wanted. Despite the fact that Theorem 15.11 
does not guarantee that the algorithm in (|5.ip can converge to a minimizer of J7 (unless one of the 
conditions in (iv) holds), it seems that for total variation minimization the result is always rather 
close to the expected minimizer. 

Let us now discuss the choice of the different parameters. As a crucial issue in order to compute 
the solution at the interface dfli n dil2 correctly, one has to pay attention to the accuracy up to 
which the projection PaXi^f^.-^if^Q-, is approximated and to the width of the stripe fli U CI2 for the 
computation of rj. The alternating iterations (j5.ip in practice are carried out with L = M = b 
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(d) 
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— Reconstruction 
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(f) 



Fig. 7.4. (a)-(f): Examples of the domain decomposition method for TV — denoising /inpainting in ID 
where the domain was split in two domains with a = 1 and t = 1/4 



inner iterations. The outer iterations are carried out until the error | J(u'^"+^)) — J(u'^"')| is of 
order 0(10"^*^). The fixed point ?/ is computed by iteration (|4.7p in maximal 10 iterations with 
initialization rjn^ = when n = I and 77^*^]^ = the 77 computed in the previous iteration, 
for n > 1. For the computation of the projection PaKi^(.ii(n) ChamboUe's algorithm (|7.4p we 
choose T = 1/4. Indeed ChamboUe points out in [T3] that, in practice, the optimal constant for 
the stability and convergence of the algorithm is not 1/8 but 1/4. Further if the derivative along 
the interface is high, i.e., if there is a step along the interface, one has to be careful concerning 
the accuracy of the computation for the projection. The stopping criterion for the iteration (|7.4p 
consists in checking that the maximum variation between pfj and p"^^ is less than 10~^. With 
less accuracy, artifacts on the interface can appear. This error tolerance may need to be further 
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decreased for a > very large. Furthermore, the size of the stripe varies between 6 and 20 pixels 
also depending on the size of a > 0, and if either inpainting is carried out via the interface or not 
(e.g., the second and third example in Figure failed in reproducing the interface correctly with 
a stripe of size 6 but computed it correctly with a stripe of size 20). 

7.2. Accelerated sparse recovery algorithms based on £i-minimization. In this sec- 
tion we are concerned with applications of the algorithms described in the previous sections to the 
case where A is a countable index set, Ti. = £2(A), and ip{u) = := X^AeA I'^aI, compare 

Examples 12.11 2. In this case we are interested to the minimization of the functional 

Jiu):=\\Tu-g\\l^^^+2a\\u\U,. (7.6) 

As already mentioned, iterative algorithms of the type (|3.ip can make the job, where §i = 
is the soft-thresholding. Unfortunately, despite its simplicity which makes it very attractive to 
users, this algorithm does not perform very well. For this reason the "domain decomposition" 
algorithm (|3.2p was proposed in [55], and there we proved its effectiveness in accelerating the 
convergence and we provided its parallelization. Here the domain is the label set A which is 
disjointly decomposed into A = Ai U A2. This decomposition produces an orthogonal splitting 
of ^2(A) into the subspaces Vi = •^2'(^) •= ^ ^2(A) : supp(u) C AJ, z = 1,2. We want to 
generalize this particular situation to an arbitrary orthogonal decomposition: 
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Let Q be an orthogonal operator on ^2(A). With this operator we denote Qa; Q''^e^ii w FinaUy 
we can define Vi :~ (5a;^2(A) for i ~ I, . . . ,7V. In particular, we can consider multiple subspaces, 
i.e., M > 2, since the algorithms and their analysis presented in the previous sections can be 
easily generalized to these cases, see [29l Section 6]. For simplicity we assume that the subspaces 
have equal dimensions when dimTi < oo. Clearly the orthogonal projection onto Vi is given by 
T^Vi ~ QAiQ\ - Differently from the domain decomposition situation for which Q — I and 

Af AT 

W^'^vMle, ^^WnvMlii, (7-7) 

1=1 i=l 

for an arbitrary splitting, i.e., for Q ^ I, (|7.7p is not guaranteed to hold. Hence, an algorithm as in 
p.2p cannot anymore be applied and one has to use (jS.ip or (|6.2[) instead. In finite dimensions there 
are several ways to compute suitable operators Q. The constructions we consider in our numerical 
examples are given by Q as the orthogonalization of a random matrix Q, e.g., via Gram-Schmidt, 
or the orthogonal matrix Q = V provided by the singular value decomposition of T = UDV*. 
Of course, for very large matrices T, the computation of the SVD is very expensive. In these 
cases, one may want to use the more efficient strategy proposed in [38] , where Q is constructed by 
computing the SVD of a relatively small submatrix of T generated by random sampling. 

The numerical examples presented in the following, refer to applications of the algorithms for 
the minimization of where the operator T is a random matrix 200 x 40 with Gaussian entries. 

7.2.1. Discussion on the convergence properties of the algorithm. It is stated in the 
Theorems 15.11 and 16.11 that the algorithms (|5.1[) or (|6.2[) may not converge to a minimizer of 
for an arbitrary orthogonal operator Q, while, in reason of (j7.7|) and condition (a) in Theorem 1 5. II 
(iv), such convergence is guaranteed for Q = I. 




standard iter, threstiolding 
subspace correction 
domain deccmpositicn 



■ 50 100 150 200 250 

Number of iterations 

Fig. 7.7. We show the application of the algorithms 12. 1\) . \5. and 13. for the minimization of J where T 
is a random matrix 200 X 40 with Gaussian entries. We considered in this example Af = 5 subspaces Vi, a = 0.005, 
30 external iterations and 30 internal iterations for the minimization on each Vi, i = 1, . . . , 5. We fixed a maximal 
number of 20 iterations in the approximate computation of the auxiliary rj's in j^. 7| ). While i.3. 2\l converges to a 
minimizer of J , this is not the case for i5.Hl . 



In Figure 17.71 we can illustrate the different convergence behavior for Q 7^ / and for Q = /, by 
a comparison with the standard iterative thresholding algorithm p.ip . Nevertheless, in several 
situations the computed solution due to (|5.ip or (|6.2p for Q / / is very close to the wanted 
minimizer, especially for a > relatively small. Moreover, it is important to observe that the 
choice of a suitable Q, for example the one provided by the singular value decomposition of T, does 
accelerate the convergence in the very first iterations, as we illustrate in Figure [7?8l In particular, 
within the first few iterations, most of the important information on the support of the minimal 
solution u is recovered. This explains the rather significant acceleration of the convergence shown 
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Fig. 7.8. We show the application of the algorithms 12. 1\) . \5.1\j . and 13.2(1 . for the minimization of J' where T 
is a random matrix 200 X 40 with Gaussian entries. We considered in this example Af = 5 subspaces Vi, a = 0.005, 
30 external iterations and 30 internal iterations for the minimization on each Vi, i = 1, . . . , 5. We fixed a maximal 
number of 20 iterations in the approximate computation of the auxiliary rj's in ^4- ^™ the first few iterations 
15. 1\) converges faster than 13.2(1 . 



in Figure 17.91 obtained by combining few initial iterations of the algorithm (|5.ip for the choice 
oi Q = V with successive iterations where the choice is switched to Q = / in order to ensure 
convergence to minimizers of This combined strategy proved to be extremely efficient and it 
is the one we consider in the rest of our discussion. 




Fig. 7.9. We show the application of the algorithms \2. (3.2(1 . and \5. j|) where the first 4 external iterations 
are performed with Q = V , followed by iterations where Q = I. Again the minimization of J' is performed assuming 
that T is a random matrix 200 X 40 with Gaussian entries and the same parameters as in the previous figures. The 
starting acceleration due to the initial choice of Q = V allows to recover sufficient information on the support of 
the sparse minimal solution, so that the following iterations for Q = I do already perform significantly better than 
El). 



We developed further experiments for the evaluation of the behavior of the algorithm ()5.ip with 
respect to other parameters, in particular the number of inner iterations for the minimization on 
each Vi, i = 1, . . . , A/", and the number A/" of subspaces. In Figure [7. 101 we show that by increasing 
the number of inner iterations we improve significantly the convergence with respect to the outer 
iterations. Of course, the improvement due to an increased number of inner iterations corresponds 
also to an increased computational effort. In order to counter-balance this additional cost one may 
consider a larger number of subspaces, that in turns implies a smaller dimension of each subspace. 
Indeed, note that inner iterations on subspaces with smaller dimension require a much less number 
of algebraic operations. In Figure [7.111 we show that by increasing the number of subspaces and, 
correspondingly, the number of inner iterations we do keep improving the convergence significantly. 
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Fig. 7.10. We show the application of the algorithms i2.1\ } and i5.1\ } (with the switching from Q = V to 
Q = I as in the previous figure ) for the minimization of J where T is a random matrix 200 X 40 with Gaussian 
entries. We considered in this application of (5. 1\} Af = 10 subspaces Vi, a = 0.005, 50 external iterations and an 
increasing number of inner iterations for the minimization on each Vi, i = 1, . . . , 5. We fixed a maximal number 
of 20 iterations in the approximate computation of the auxiliary rj's in j^. 7| ). We can observe that by increasing 
the number of inner iterations we can significantly improve the rate of convergence with respect to the external 
iterations. 



Hence, the parallel algorithm ()6.2p adapted to a large number of subspaces performs very fast as 
soon as the inner iterations are also increased correspondingly. 
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Fig. 7.11. We show the application of the algorithms i2.1\} and i5.1\} (with the switching from Q = V to 
Q = I after 4 external iterations ) for the minimization of J where T is a random matrix 200 X 40 with Gaussian 
entries. We considered in this application of iS.lil an increasing number M = 2,4,10,50 of subspaces Vi and 
correspondingly an increasing number, 2,4,40,80, of inner iterations. Again we fixed a = 0.005, 50 external 
iterations, and a maximal number of 20 iterations in the approximate computation of the auxiliary rj 's in {4- 'T^ ■ 
We can observe that by increasing the number of subspaces and inner iterations we can significantly improve the 
rate of convergence with respect to the external iterations. 



8. Conclusion. Optimization of functionals promoting sparse recovery, e.g., £i-minimization 
and total variation minimization (where the sparsity is at the level of derivatives), were proposed 
in order to extract few significant features of the solution originally defined in very high dimen- 
sions. As a matter of fact, these minimizations cannot be performed by ordinary methods when 
the dimension scale is extremely large, for speed, resources, and memory restrictions. Hence, do- 
main decomposition or subspace correction methods have to be invoked in these cases. Our work 
contributes to remedy the lack of such methods for these specific problems. We introduced parallel 
and alternating optimization algorithms on sequences of orthogonal subspaces of a Hilbert space, 
for the minimization of energy functionals involving convex constraints coinciding with semi-norms 
for a subspace. We provided an efficient numerical method for the implementation of the algo- 
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rithm via oblique thresholding, defined by suitable Lagrange multipliers. It is important to notice 
that, on the one hand, these algorithms are realized by re-utilizing the basic building blocks of 
standard proximity-map iterations, e.g., projections onto convex sets, no significant complications 
in the implementations occur. On the other hand, several tricks can be applied in order to limit 
the computational load produced by the multiple iterations occurring on several subspaces (com- 
pare subsection 1 7 . 1 . 4| ) . We investigated the convergence properties of the algorithms, providing 
sufficient conditions for ensuring the convergence to minimizers. We showed the applicability of 
these algorithms in delicate situations, like in domain decomposition methods for singular elliptic 
PDE's with discontinuous solutions in ID and 2D, and in accelerations of ^i-minimizations. The 
numerical experiments nicely confirm the results predicted by the theory. 
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